XR Adaptive Modality: Experiment Report

Author

Mohammad Dastgheib

Published

December 8, 2025

1. Executive Summary

This report analyzes 7 participants performing Fitts’ law pointing tasks across two input modalities (Hand, Gaze) and two UI modes (Static, Adaptive).

Key Findings

  • Total Trials Analyzed: 1306 valid trials (correct responses, RT 150–6000 ms)
  • Total Trials Collected: 1485
  • Overall Error Rate: 12.1%
  • Mean Throughput: 2.91 bits/s (SD = 0.74)
  • Mean Movement Time: 1.277s (SD = 0.516s)
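
The inclusion criteria can be expressed as a simple filter; a sketch assuming per-trial records with hypothetical `correct` and `rt_ms` fields:

```python
def valid_trials(trials):
    """Apply the report's inclusion criteria: keep correct responses with
    reaction times between 150 ms and 6000 ms (field names are illustrative,
    not the actual dataset schema)."""
    return [t for t in trials
            if t["correct"] and 150 <= t["rt_ms"] <= 6000]

# Example: one trial fails each criterion, one passes
trials = [
    {"correct": True,  "rt_ms": 1200},  # kept
    {"correct": False, "rt_ms": 1200},  # error trial
    {"correct": True,  "rt_ms": 100},   # anticipatory (< 150 ms)
    {"correct": True,  "rt_ms": 7000},  # timeout (> 6000 ms)
]
kept = valid_trials(trials)
```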

2. Demographics

Overall Demographics

N Mean Age SD Age Age Range Mean Gaming (Hrs/Week) SD Gaming
7 28 8.8 18–42 0.9 1.2

By Gender

gender Count Avg Age SD Age Avg Gaming (Hrs)
female 1 28 NA 0
male 6 28 9.6 1

Input Device Distribution

input_device Count Percentage
mouse 7 100

3. Primary Analysis: Throughput

Research Question: Does the Adaptive UI improve performance (Throughput) compared to Static, especially for Gaze?

Summary Statistics

Throughput (bits/s) by Condition
modality ui_mode pressure N Mean SD Median Q25 Q75
hand static 1 21 3.16 0.69 3.22 2.72 3.60
hand adaptive 1 21 3.19 0.73 3.22 2.57 3.81
gaze static 1 21 2.72 0.61 2.72 2.38 3.03
gaze adaptive 1 21 2.57 0.74 2.52 2.02 3.10
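
For reference, these values follow the Shannon formulation of throughput with effective width, TP = mean(log2(D/We + 1) / MT). A minimal sketch (the trial values below are illustrative):

```python
import math

def mean_throughput(distances, eff_widths, times):
    """Per-trial throughput in bits/s: IDe = log2(D / We + 1), TP = IDe / MT,
    averaged over trials (Shannon formulation with effective width)."""
    tps = [math.log2(d / w + 1) / t
           for d, w, t in zip(distances, eff_widths, times)]
    return sum(tps) / len(tps)

# One trial: 300 px movement amplitude, 40 px effective width, 1.2 s movement time
tp = mean_throughput([300.0], [40.0], [1.2])  # log2(8.5) / 1.2 ≈ 2.57 bits/s
```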

Visualizations

Throughput by Modality and UI Mode. Higher values indicate better performance. White diamonds show mean values. Violin plots show the distribution shape, boxplots show quartiles, and individual points may be visible as outliers.

Statistical Model Results

⚠ **Statistical model could not be fitted for Throughput.**

**Reason:** Model fitting failed because contrasts can be applied only to factors with 2 or more levels: the `pressure` factor has a single observed level (all trials ran at pressure = 1), so it cannot be contrast-coded and should be dropped from the model formula.

**Diagnostics:**
- Participants:  7 
- Total trials:  84 
- Conditions:  4 
- Minimum trials per condition:  21 
- Empty conditions:  0 

**Trials per condition:**
modality ui_mode pressure n
hand static 1 21
hand adaptive 1 21
gaze static 1 21
gaze adaptive 1 21
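
A likely fix, given the diagnostics above: `pressure` is constant across all trials, and single-level factors cannot be contrast-coded. A sketch of screening factors before building the model formula (the formula string is illustrative):

```python
def usable_factors(data, factors):
    """Drop factors with fewer than two observed levels; a single-level
    factor (like `pressure`, fixed at 1 in this dataset) cannot be
    contrast-coded and makes model fitting fail."""
    return [f for f in factors if len(set(data[f])) >= 2]

# Columns mirror the trials-per-condition table above
data = {
    "modality": ["hand", "hand", "gaze", "gaze"],
    "ui_mode":  ["static", "adaptive", "static", "adaptive"],
    "pressure": ["1", "1", "1", "1"],
}
keep = usable_factors(data, ["modality", "ui_mode", "pressure"])
formula = "throughput ~ " + " * ".join(keep)  # pressure is excluded
```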

4. Movement Time Analysis

Research Question: How does movement time vary across conditions?

Summary Statistics

Movement Time (s) by Condition
modality ui_mode pressure N Mean SD Median
hand static 1 358 1.195 0.391 1.119
hand adaptive 1 362 1.160 0.296 1.110
gaze static 1 285 1.334 0.501 1.215
gaze adaptive 1 301 1.460 0.757 1.212

Visualizations

Movement Time by Modality and UI Mode. Lower values indicate faster performance. White diamonds show mean values. Violin plots show the distribution shape, boxplots show quartiles, and individual points may be visible as outliers.

Statistical Model Results

⚠ **Statistical model could not be fitted for Movement Time.**

**Reason:** Model fitting failed because contrasts can be applied only to factors with 2 or more levels: the `pressure` factor has a single observed level (all trials ran at pressure = 1), so it cannot be contrast-coded and should be dropped from the model formula.

**Diagnostics:**
- Participants:  7 
- Total trials:  1306 
- Conditions:  4 
- Minimum trials per condition:  285 
- Empty conditions:  0 

**Trials per condition:**
modality ui_mode pressure n
hand static 1 358
hand adaptive 1 362
gaze static 1 285
gaze adaptive 1 301

5. Fitts’ Law Modelling

Research Question: How well does the data fit Fitts’ Law? (Linearity check). Flatter slopes indicate less sensitivity to difficulty (ballistic movement).

Fitts’ Law Regression (Movement Time vs Effective Index of Difficulty). The effective index of difficulty (IDe) is calculated using the effective target width (We) derived from the spatial distribution of selection endpoints. Shaded regions around regression lines represent 95% confidence intervals. Linear regression fits are shown separately for each modality and UI mode combination.

### Model Fit Statistics
Linear Regression: MT ~ IDe
modality ui_mode r_squared slope intercept
hand static 0.709 0.190 0.464
hand adaptive 0.820 0.154 0.576
gaze static 0.617 0.240 0.468
gaze adaptive 0.464 0.357 0.198
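
These fits are ordinary least-squares regressions of MT on IDe; a self-contained sketch of the computation, using synthetic data for illustration:

```python
def fit_fitts(ide, mt):
    """Ordinary least squares for MT = intercept + slope * IDe;
    returns (intercept, slope, r_squared)."""
    n = len(ide)
    mx, my = sum(ide) / n, sum(mt) / n
    sxx = sum((x - mx) ** 2 for x in ide)
    sxy = sum((x - mx) * (y - my) for x, y in zip(ide, mt))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(ide, mt))
    ss_tot = sum((y - my) ** 2 for y in mt)
    return intercept, slope, 1 - ss_res / ss_tot

# Synthetic check: data generated from MT = 0.46 + 0.19 * IDe fits exactly
ide = [2.0, 3.0, 4.0, 5.0]
mt = [0.46 + 0.19 * x for x in ide]
a, b, r2 = fit_fitts(ide, mt)
```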

6. Error Rate Analysis

Research Question: How do error rates differ across conditions?

Error Rates by Condition
modality ui_mode pressure Trials Errors Error Rate (%)
hand static 1 378 20 5.29
hand adaptive 1 378 16 4.23
gaze static 1 351 66 18.80
gaze adaptive 1 378 77 20.37

Error Rate by Modality and UI Mode. Lower values indicate fewer errors. Error rate is calculated as the percentage of trials with incorrect selections (misses, timeouts, or false activations). White diamonds show mean values. Violin plots show the distribution shape, boxplots show quartiles.

Statistical Model Results

⚠ **Statistical model could not be fitted for Error Rate.**

**Reason:** Model fitting failed because contrasts can be applied only to factors with 2 or more levels: the `pressure` factor has a single observed level (all trials ran at pressure = 1), so it cannot be contrast-coded and should be dropped from the model formula.

**Diagnostics:**
- Participants:  7 
- Total trials:  1485 
- Conditions:  4 
- Minimum trials per condition:  351 
- Empty conditions:  0 
- Overall error rate:  12.1%

**Trials per condition:**
modality ui_mode pressure n
hand static 1 378
hand adaptive 1 378
gaze static 1 351
gaze adaptive 1 378

7. Accuracy & Gaze Dynamics

Effective Width (\(W_e\))

Lower \(W_e\) indicates tighter shot grouping (higher precision).

Effective Width (px) by Condition
modality ui_mode pressure Mean_We SD_We
hand static 1 34.07 20.05
hand adaptive 1 35.47 22.51
gaze static 1 36.91 18.32
gaze adaptive 1 37.80 17.75

Effective Target Width (Accuracy) by Modality and UI Mode. Lower values indicate tighter shot grouping and higher precision. Effective width (We) is calculated as 4.133 × SD of projected errors along the task axis, normalizing to a 4% error rate. White diamonds show mean values. Violin plots show the distribution shape, boxplots show quartiles.
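
The We computation can be sketched directly from that definition, projecting endpoint errors onto the task axis (the coordinates below are illustrative):

```python
import math
import statistics

def effective_width(endpoints, start, target):
    """We = 4.133 * SD of signed endpoint deviations projected onto the
    task axis (start -> target center), the effective-width correction
    that normalizes to a 4% error rate."""
    ax, ay = target[0] - start[0], target[1] - start[1]
    norm = math.hypot(ax, ay)
    devs = [((ex - target[0]) * ax + (ey - target[1]) * ay) / norm
            for ex, ey in endpoints]
    return 4.133 * statistics.stdev(devs)

# Horizontal task axis: deviations along x are -1, 0, +1 px (sample SD = 1 px)
we = effective_width([(9, 0), (10, 0), (11, 0)], start=(0, 0), target=(10, 0))
```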

Endpoint Accuracy Scatter Plot

Visualization of endpoint errors relative to target center. Each point represents one trial’s endpoint position.

Endpoint Accuracy Scatter Plot for Gaze Modality. Each point represents one trial’s endpoint position relative to the target center (0,0). The red dashed circle shows the approximate target size. Points closer to the center indicate better accuracy. Dotted lines indicate zero error in X and Y directions. Faceted by pressure condition.
Endpoint Error Distance (px) for Gaze Modality
ui_mode pressure N Mean_Error SD_Error Median_Error
static 1 285 12.29 8.09 10.41
adaptive 1 301 12.48 7.98 10.88

The “Midas Touch” Struggle

Target Re-entries measure how often the cursor drifted out of the target before selection.

Target Re-entries by Condition
modality ui_mode pressure Mean_Reentries SD_Reentries
hand static 1 0.00 0.00
hand adaptive 1 0.00 0.00
gaze static 1 1.99 0.91
gaze adaptive 1 2.11 0.91
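
Re-entry counting reduces to counting target entries after the first in a per-frame inside/outside trace; a minimal sketch (the boolean trace is illustrative):

```python
def count_reentries(inside_flags):
    """Number of target entries after the first, given a per-frame boolean
    trace of whether the cursor was inside the target. 0 = entered once
    and stayed; >= 1 = slipped out and came back."""
    entries = sum(1 for prev, cur in zip([False] + inside_flags, inside_flags)
                  if cur and not prev)
    return max(entries - 1, 0)

# Cursor enters, slips out, re-enters once before selection
trace = [False, True, True, False, True, True]
```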

Target Re-entries (Control Stability) by Modality and UI Mode. Re-entries count how often the cursor drifted out of the target before final selection, measuring control stability during the verification phase. Counts > 1 indicate slipping out of target. Lower values are better. White diamonds show mean values. Violin plots show the distribution shape, boxplots show quartiles.

8. Workload (NASA-TLX)

Subjective workload scores (lower is better).

NASA-TLX Workload Scores by Modality and UI Mode. Scores range from 0-100, where lower values indicate lower subjective workload. The six TLX scales (Mental Demand, Physical Demand, Temporal Demand, Performance, Effort, Frustration) are shown separately. White diamonds show mean values. Violin plots show the distribution shape, boxplots show quartiles.

Overall NASA-TLX Workload Score. Average across all 6 scales (Mental, Physical, Temporal, Performance, Effort, Frustration). Lower values indicate lower overall subjective workload. White diamonds show mean values. Violin plots show the distribution shape, boxplots show quartiles.

NASA-TLX Workload Components (Stacked Bar Chart). Total height represents overall workload, with each colored segment representing one of the six TLX scales (Mental Demand, Physical Demand, Temporal Demand, Performance, Effort, Frustration). Lower total height indicates lower overall subjective workload.

9. Learning Curves & Practice Effects

Research Question: How does performance change within each condition? Do learning rates differ by condition?

This section shows learning curves aligned by condition start (accounting for Williams counterbalancing). For block-level trends, see Section 12.

Learning Curve Data Summary by Condition
Modality UI Mode Pressure N Positions Mean RT (s) Mean Error Rate
Hand Static ON 27 1.194 0.0529
Hand Adaptive ON 27 1.161 0.0423
Gaze Static ON 27 1.330 0.1880
Gaze Adaptive ON 27 1.462 0.2037

Learning Curves: Movement Time Within Condition. Learning aligned by position within condition (accounting for counterbalancing). LOESS smoothing. Lower is better. Shaded regions show 95% CI.
Error Rate Summary by Condition
Modality UI Mode Pressure N Positions Mean Error Rate Min Error Rate Max Error Rate
Hand Static ON 27 5.29% 0.00% 14.29%
Hand Adaptive ON 27 4.23% 0.00% 14.29%
Gaze Static ON 27 18.80% 0.00% 46.15%
Gaze Adaptive ON 27 20.37% 7.14% 42.86%

Learning Curves: Error Rate Within Condition. Learning aligned by position within condition (accounting for counterbalancing). LOESS smoothing. Lower is better. Shaded regions show 95% CI.

Note: Data aligned by position within condition to account for Williams counterbalancing. For block-level trends, see Section 12: Block Order & Temporal Effects.


10. Movement Quality Metrics

Submovement Analysis

Research Question: Does adaptive UI reduce movement corrections? How do submovements relate to performance?

Submovements indicate intermittent control; fewer submovements suggest smoother, more ballistic movements.

⚠ submovement_count column not found in dataset.

Verification Time Analysis

Research Question: How much time is spent “stopping” vs. “moving”? Does adaptive UI reduce verification time?

Verification time represents the “precise stopping” phase, separate from the ballistic movement phase.


11. Error Patterns & Types

Research Question: What types of errors occur? Do error patterns differ by condition?

Error Type Distribution by Condition
modality ui_mode pressure err_type Count Total Percentage
hand static 1 miss 19 20 95.0
hand static 1 timeout 1 20 5.0
hand adaptive 1 miss 16 16 100.0
gaze static 1 slip 66 66 100.0
gaze adaptive 1 timeout 1 77 1.3
gaze adaptive 1 slip 76 77 98.7
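
The percentage column can be reproduced by normalizing error-type counts within each condition; a small sketch with toy records (condition labels and counts are illustrative):

```python
from collections import Counter

def error_breakdown(error_trials):
    """Percentage of each error type within each condition, given one
    (condition, err_type) record per error trial."""
    counts = Counter(error_trials)
    totals = Counter(cond for cond, _ in error_trials)
    return {(cond, etype): 100.0 * n / totals[cond]
            for (cond, etype), n in counts.items()}

# Toy data in the shape of the table above
errors = [("gaze/adaptive", "slip")] * 3 + [("gaze/adaptive", "timeout")]
pct = error_breakdown(errors)
```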

Error Type Distribution. Breakdown of error types by condition. Stacked bars show absolute counts for each error type (Miss, Timeout, Slip).

Error Type Proportions. Relative distribution of error types. Stacked bars show percentage of each error type (Miss, Timeout, Slip) within each condition.

12. Block Order & Temporal Effects

Research Question: Are there order effects? Does performance improve or degrade over blocks?

Performance Across Blocks: Throughput. Throughput by block number. Higher is better. Shaded regions show ±1 SE.

Performance Across Blocks: Movement Time. Movement time by block number. Lower is better. Shaded regions show ±1 SE.
Block-Level Data Summary by Condition
Modality UI Mode Pressure N Blocks Mean Error Rate
Hand Static ON 6 5.56%
Hand Adaptive ON 6 4.28%
Gaze Static ON 5 19.44%
Gaze Adaptive ON 6 16.69%

Performance Across Blocks: Error Rate. Error rate by block number. Lower is better. Shaded regions show ±1 SE.

13. Spatial Patterns & Heatmaps

Research Question: Are there spatial biases in performance? Do some screen regions show better/worse performance?

Performance by Target Position

Error Density Heatmap

Where do endpoint errors occur? Are there systematic spatial biases?


14. Adaptive UI Mechanism Analysis

Width Scaling (Target Size Adaptation)

Research Question: Does the adaptive UI dynamically change target sizes? How does width scaling relate to performance?

The adaptive UI may scale target widths based on performance. This section examines whether and how target sizes are adjusted.

⚠ Width scaling columns not found in dataset.

Alignment Gate Metrics

Research Question: If alignment gates are used, how do they affect performance? How often are false triggers detected?

Alignment gates may be used to ensure proper cursor alignment before selection. This section examines their usage and effectiveness.

⚠ Alignment gate columns not found in dataset.

Task Type Analysis

Research Question: Are there different task types (point vs. drag)? How does performance differ across task types?

If the experiment includes different task types, this section examines performance differences.

⚠ task_type column not found in dataset.

15. Gaze-Specific Analysis

Hover Time (Dwell Duration)

Research Question: How long do people hover before confirming? Does adaptive UI change dwell behavior?

For gaze modality, hover time represents the dwell duration before confirmation.

⚠ No valid hover time data available for gaze modality.

16. Summary & Conclusions

Key Findings Summary

Summary of Key Metrics by Condition
modality ui_mode Metric Mean SD
hand static Effective Width (px) 34.070 20.050
hand adaptive Effective Width (px) 35.470 22.510
gaze static Effective Width (px) 36.910 18.320
gaze adaptive Effective Width (px) 37.800 17.750
hand static Error Rate (%) 5.290 22.420
hand adaptive Error Rate (%) 4.230 20.160
gaze static Error Rate (%) 18.800 39.130
gaze adaptive Error Rate (%) 20.370 40.330
hand static Movement Time (s) 1.195 0.391
hand adaptive Movement Time (s) 1.160 0.296
gaze static Movement Time (s) 1.334 0.501
gaze adaptive Movement Time (s) 1.460 0.757
hand static Throughput (bits/s) 3.160 0.690
hand adaptive Throughput (bits/s) 3.190 0.730
gaze static Throughput (bits/s) 2.720 0.610
gaze adaptive Throughput (bits/s) 2.570 0.740

Data Quality Notes

  • Participants: 7
  • Valid Trials: 1306 (out of 1485 total experimental trials)
  • Exclusion Rate: 12% (due to errors, timeouts, or invalid RTs)
  • Trials per Participant: Mean = 186.6, Range = 164–209